
Incomplete Dot Products for Dynamic Computation Scaling in Neural Network Inference


Abstract

We propose the use of incomplete dot products (IDP) to dynamically adjust the number of input channels used in each layer of a convolutional neural network during feedforward inference. IDP adds monotonically non-increasing coefficients, referred to as a "profile", to the channels during training. The profile orders the contribution of each channel in non-increasing order. At inference time, the number of channels used can be dynamically adjusted to trade off accuracy for lowered power consumption and reduced latency by selecting only a beginning subset of channels. This approach allows a single network to dynamically scale over a computation range, as opposed to training and deploying multiple networks to support different levels of computation scaling. Additionally, we extend the notion to multiple profiles, each optimized for some specific range of computation scaling. We present experiments on the computation and accuracy trade-offs of IDP for popular image classification models and datasets. We demonstrate that, for MNIST and CIFAR-10, IDP reduces computation significantly, e.g., by 75%, without significantly compromising accuracy. We argue that IDP provides a convenient and effective means for devices to lower computation costs dynamically to reflect the current computation budget of the system. For example, VGG-16 with 50% IDP (using only the first 50% of channels) achieves 70% accuracy on the CIFAR-10 dataset, compared to the standard network, which achieves only 35% accuracy when using the reduced channel set.
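To make the mechanism concrete, the following is a minimal PyTorch sketch of an IDP-style convolution, not the authors' implementation: a fixed, monotonically non-increasing profile weights the input channels during training, and at inference a keep_fraction argument selects only a beginning subset of channels. The class name IDPConv2d, the linear profile shape, and the keep_fraction parameter are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IDPConv2d(nn.Module):
    """Convolution with an incomplete dot product over input channels (sketch)."""

    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size,
                              padding=kernel_size // 2)
        # Monotonically non-increasing "profile" over input channels.
        # A linear profile gamma_i = 1 - i/C is one assumed choice; the
        # abstract does not fix a particular shape.
        profile = 1.0 - torch.arange(in_channels, dtype=torch.float32) / in_channels
        self.register_buffer("profile", profile.view(1, -1, 1, 1))

    def forward(self, x, keep_fraction=1.0):
        # Keep only the first round(keep_fraction * C) input channels; the
        # profile orders channel contributions in non-increasing order, so
        # truncating the tail degrades accuracy gracefully.
        c = max(1, int(round(keep_fraction * x.size(1))))
        x = x[:, :c] * self.profile[:, :c]
        w = self.conv.weight[:, :c]  # matching slice of the kernel
        return F.conv2d(x, w, self.conv.bias, padding=self.conv.padding)

# Usage: full computation vs. 50% IDP on the same trained layer.
layer = IDPConv2d(64, 128, 3)
x = torch.randn(1, 64, 32, 32)
y_full = layer(x)                     # all 64 input channels
y_half = layer(x, keep_fraction=0.5)  # first 32 channels only
```

Because the profile concentrates contribution in the leading channels, dropping the trailing ones trades accuracy for proportionally fewer multiply-accumulates, which is the dynamic scaling behavior the abstract describes.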
